Abstract: One of the most common types of functions in
mathematics, physics, and engineering is a sum of products, 
sometimes called a partition function. After "normalization," a 
sum of products has a natural graphical representation, called a 
normal factor graph (NFG), in which vertices represent factors, 
edges represent internal variables, and half-edges represent the 
external variables of the partition function. In physics, so-called 
trace diagrams share similar features. 

We believe that the conceptual framework of representing 
sums of products as partition functions of NFGs is an important 
and intuitive paradigm that, surprisingly, does not seem to have 
been introduced explicitly in the previous factor graph literature. 

Of particular interest are NFG modifications that leave the 
partition function invariant. A simple subclass of such NFG 
modifications offers a unifying view of the Fourier transform, 
tree-based reparameterization, loop calculus, and the Legendre 
transform. 

I. Introduction 

Functions that can be expressed as sums of products are 
ubiquitous in mathematics, science, and engineering. Borrow- 
ing a physics term, we call such a function a partition function. 

In this paper, we will represent partition functions by normal 
factor graphs (NFGs), which build on the concepts of factor 
graphs [14] and normal graphs [12]. A factor graph represents
a product of factors by a bipartite graph, in which one set 
of vertices represents variables, while the other set of vertices 
represents factors. By introducing "normal" degree restrictions 
as in [12], we can represent a sum of products by an NFG in 
which edges represent variables and vertices represent factors. 
Moreover, internal and external variables are distinguished in 
an NFG by being represented by edges of degree 2 and degree 
1, respectively. NFGs closely resemble the "Forney-style factor
graphs" (FFGs) of Loeliger et al. [15], [16], with the difference
that "closing the box" (summing over internal variables) is 
always explicitly assumed as part of the graph semantics. 

There are as many applications of NFGs as there are of sums 
of products. In this paper, we will present several applications 
that highlight the usefulness of the graphical approach: 

• Trace diagrams, which are closely related to NFGs, 
often provide insight into linear algebraic relations, par- 
ticularly of the kind that arise in various areas of physics; 

• The sum-product algorithm is nicely derived
in terms of NFGs;

• The normal factor graph duality theorem [2], [13] is
a powerful general result, of which one corollary is the
normal graph duality theorem of [12];



• The holographic transformations of NFGs of
Al-Bashabsheh and Mao [2], which may be used to derive
the "holographic algorithms" of Valiant [21] and others,
may be further generalized to derive the "tree-based
reparameterization" approach of Wainwright et al. [25],
the "loop calculus" results of Chertkov and Chernyak
[7], [8], and the Lagrange duality results of Vontobel and
Loeliger [23], [24];

• Linear codes defined on graphs and their weight gen- 
erating functions have natural representations as NFGs, 
as shown in [13], but we will not discuss this topic here. 

II. Partition Functions and Graphs 

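A partition function is any function $Z(\mathbf{x})$ that is given in
"sum-of-products form," as follows:

$$Z(\mathbf{x}) = \sum_{\mathbf{y} \in \mathcal{Y}} \prod_{k \in \mathcal{K}} f_k(\mathbf{x}_k, \mathbf{y}_k), \quad \mathbf{x} \in \mathcal{X},$$

where

• $X$ is a set of $m$ external variables $X_i$ taking values $x_i$
in alphabets $\mathcal{X}_i$, $1 \le i \le m$;

• $Y$ is a set of $n$ internal variables $Y_j$ taking values $y_j$
in alphabets $\mathcal{Y}_j$, $1 \le j \le n$;

• each factor $f_k(\mathbf{x}_k, \mathbf{y}_k)$, $k \in \mathcal{K}$, is a function of certain
subsets $X_k \subseteq X$ and $Y_k \subseteq Y$ of the sets of external and
internal variables, respectively.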

The set $\mathcal{X} = \prod_{i=1}^{m} \mathcal{X}_i$ of all possible external variable
configurations is called the domain of the partition function,
and the set $\mathcal{Y} = \prod_{j=1}^{n} \mathcal{Y}_j$ of all possible internal variable
configurations is called its configuration space. We say that a
factor $f_k(\mathbf{x}_k, \mathbf{y}_k)$ involves a variable $X_i$ (resp. $Y_j$) if $f_k$ is a
function of that variable; i.e., if $X_i \in X_k$ (resp. $Y_j \in Y_k$).
For simplicity, we will assume that all functions are complex-valued,
and that all variable alphabets are discrete.^1

A particular sum-of-products form for a partition function
will be called a realization. Different realizations that yield
the same partition function $Z : \mathcal{X} \to \mathbb{C}$ will be called
equivalent. We say that equivalent realizations preserve the
partition function.
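As a minimal sketch of the sum-of-products definition, the following brute-force evaluator enumerates all internal configurations for each external configuration. The function name `partition_function`, the dictionary-based interface, and the toy factors are illustrative assumptions, not part of the original paper.

```python
from itertools import product

def partition_function(ext_alphabets, int_alphabets, factors):
    """Return Z as a dict mapping each external configuration x to the
    sum over internal configurations y of the product of all factors."""
    ext_names = list(ext_alphabets)
    int_names = list(int_alphabets)
    Z = {}
    for x in product(*(ext_alphabets[n] for n in ext_names)):
        total = 0
        for y in product(*(int_alphabets[n] for n in int_names)):
            values = dict(zip(ext_names, x))
            values.update(zip(int_names, y))
            term = 1
            for variables, f in factors:
                # Each factor sees only the variables it involves.
                term *= f(*(values[v] for v in variables))
            total += term
        Z[x] = total
    return Z

# Hypothetical toy realization: Z(x1) = sum_{y1} f1(x1, y1) * f2(y1)
factors = [
    (("X1", "Y1"), lambda x1, y1: 1 + x1 * y1),
    (("Y1",), lambda y1: 2 if y1 == 0 else 3),
]
Z = partition_function({"X1": [0, 1]}, {"Y1": [0, 1]}, factors)
```

Two realizations are equivalent in the sense above exactly when such an evaluator returns the same dictionary for both.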

^1 Usually in physics a partition function is a sum over internal configurations
(state configurations), and there are no external variables in our sense
(although there may be parameters, such as temperature). So our usage of
"partition function" extends the usual terminology of physics. Al-Bashabsheh
and Mao [2] use the term "exterior function."



A. Normal partition functions 

We will say that a realization of a partition function is 
normal if all external variables are involved in precisely one 
factor fk, and all internal variables are involved in precisely 
two factors. These degree restrictions were introduced in [12]
in the context of behavioral graphs.

As observed in [12], any realization may be converted
to an equivalent normal realization by the following simple 
normalization procedure. 

• For every external variable $X_i$, if $X_i$ is involved in $p$
factors, then define $p$ replica variables $X_{i\ell}$, $1 \le \ell \le p$,
replace $X_i$ by $X_{i\ell}$ in the $\ell$th factor in which $X_i$ is
involved, and introduce one new factor, namely an equality
indicator function $\Phi_=(x_i, \{x_{i\ell}, 1 \le \ell \le p\})$ (see below).

• For every internal variable $Y_j$, if $Y_j$ is involved in $q > 2$
factors, then define $q$ replica variables $Y_{j\ell}$, $1 \le \ell \le q$,
replace $Y_j$ by $Y_{j\ell}$ in the $\ell$th factor in which $Y_j$ is
involved, and introduce one new factor, namely an equality
indicator function $\Phi_=(\{y_{j\ell}, 1 \le \ell \le q\})$.

Thus all replica variables are internal variables that are in- 
volved in precisely two factors, while the external variables 
Xi become involved in only one factor, namely an equality 
indicator function. Evidently this normalization procedure 
preserves the partition function. 
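The normalization procedure can be checked numerically on a small case. The sketch below assumes a hypothetical realization in which one external variable appears in two factors and one internal variable appears in three; the factors `f1`, `f2`, `f3` and the common alphabet `A` are made up for illustration.

```python
from itertools import product

A = [0, 1, 2]  # common alphabet (illustrative)

f1 = lambda x, y: x + y + 1
f2 = lambda x, y: 2 * x + y * y
f3 = lambda y: 3 - y

def eq(*vals):
    """Equality indicator: 1 if all arguments agree, else 0."""
    return 1 if len(set(vals)) == 1 else 0

def Z_original(x):
    # X appears in two factors and Y in three: not a normal realization.
    return sum(f1(x, y) * f2(x, y) * f3(y) for y in A)

def Z_normal(x):
    # Replicas x1, x2 of X and y1, y2, y3 of Y are internal variables, each
    # appearing in exactly one original factor and one equality indicator.
    total = 0
    for x1, x2, y1, y2, y3 in product(A, repeat=5):
        total += (eq(x, x1, x2) * eq(y1, y2, y3)
                  * f1(x1, y1) * f2(x2, y2) * f3(y3))
    return total
```

After normalization the external variable is involved only in its equality indicator, and every replica is involved in exactly two factors, yet `Z_normal(x)` agrees with `Z_original(x)` for every `x`.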

B. Normal factor graphs 

For a normal realization of a partition function, a natural 
graphical model is a normal factor graph (NFG), in which 
vertices are associated with factors, ordinary edges (i.e.,
hyperedges of degree 2) are associated with internal variables,
"half-edges" [12] (i.e., hyperedges of degree 1) are associated
with external variables, and a variable edge or half-edge is
incident on a factor vertex if the variable is involved in that
factor.

Example 1 (vector-matrix multiplication). Consider a
multiplication $v = wM$ of a vector $w$ by a matrix $M$, namely

$$v_j = \sum_{i \in \mathcal{I}} w_i M_{ij}, \quad j \in \mathcal{J},$$

for some discrete index sets $\mathcal{I}$ and $\mathcal{J}$. This may be interpreted
as a normal realization of the function $v : \mathcal{J} \to \mathbb{C}$ with
external variable $J$, internal variable $I$, and factors $w_i$ and
$M_{ij}$. Figure 1 shows the corresponding normal factor graph,
in which the vertices are represented by labeled boxes, and the
half-edge is represented by a special dongle symbol.^2 □

Fig. 1. Normal factor graph of a matrix multiplication $v = wM$.



^2 The dongle symbol "⊣" was chosen in [12] to suggest the possibility of
a connection to another external half-edge in the manner of two railroad cars
coupling, but of course this embellishment may be omitted.



C. Equality indicator functions 

We use special symbols for certain frequently occurring
factors. The most common and fundamental factor is the
equality indicator function $\Phi_=$, which equals 1 if all incident
variables (which must have a common alphabet) are equal, and
equals 0 otherwise.

Figure 2 shows three ways of representing an equality
indicator function: first, by a vertex labeled by $\Phi_=$; second, by
a vertex labeled simply by an equality sign; and third, as a
junction vertex. The second representation makes a connection
with the behavioral graph literature (e.g., Tanner graphs),
where vertices represent constraints rather than factors. The
third representation makes connections with ordinary block
diagrams, where any number of edges representing the same
variable may meet at a junction, as well as with the factor
graph literature, where variables are represented by vertices
rather than by edges.






Fig. 2. Three representations of an equality indicator function of degree 3. 

An equality indicator function of degree 2 is often denoted
by a Kronecker delta function $\delta$. Since such a function
connects only two edges and constrains their respective variables
to be equal, it may simply be omitted, as shown in Figure 3.^3



Fig. 3. Three representations of an equality indicator function of degree 2. 



III. Trace Diagrams 

Physicists have long used graphical diagrams
called "trace diagrams" [10], [17], [18], [19], [20] that
use semantics similar to those of NFGs. In this section we
give a brief exposition of this topic, following [19].

In trace diagrams, the factors are often vectors, matrices,
tensors, and so forth, and the variables are typically their
indices. For instance, a matrix $M = \{M_{ij}, i \in \mathcal{I}, j \in \mathcal{J}\}$
may be considered to be a function of the two variables $I$ and
$J$, and is represented as a vertex with two incident edges, as
in Figure 4(a).



Fig. 4. Representations of (a) a matrix $M$; (b) the trace of $M$.



^3 The last equivalence shown in Figure 3 is actually a bit problematic, since a
single edge is not a legitimate normal factor graph; however, as a component
of a normal factor graph, such an edge is always incident on some factor
vertex $f_k$, and since the combination of a factor involving some internal
variable $Y_j$ with an equality function $\Phi_=(y_j, y_j')$ is just the same factor with
$Y_j'$ substituted for $Y_j$, this substitution can be made in any legitimate NFG
(see also [2]).



Trace diagrams use the NFG convention that dangling edges
(half-edges) represent external variables, whereas ordinary
edges represent internal variables, and are to be summed over.
For example, if the matrix $M$ is square (i.e., the index
alphabets $\mathcal{I}$ and $\mathcal{J}$ are the same), and the half-edges representing
$I$ and $J$ are connected as in Figure 4(b), then the resulting
figure represents the trace of $M$, since $\operatorname{Tr} M = \sum_i M_{ii}$. This
apparently explains why these kinds of graphical models are
known as "trace diagrams."

The convention that indices that appear twice are implicitly 
to be summed over is known in physics as the Einstein sum- 
mation convention. This convention is used rather generally in 
physics, not just with trace diagrams. 

Trace diagrams permit visual proofs of various relationships
in linear algebra. For example, Figure 5 proves the identity
$\operatorname{Tr} ABC = \operatorname{Tr} BCA$.

Fig. 5. Proof of the identity $\operatorname{Tr} ABC = \operatorname{Tr} BCA$.
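The cyclic trace identity can be read as a sum over a closed loop of three index edges. The following sketch, with made-up 2 x 2 integer matrices and a hypothetical helper `trace_of_product`, evaluates that loop directly; rotating the loop only relabels the summed indices.

```python
from itertools import product

def trace_of_product(A, B, C):
    """Sum over the closed loop of indices i -> j -> k -> i."""
    n = len(A)
    return sum(A[i][j] * B[j][k] * C[k][i]
               for i, j, k in product(range(n), repeat=3))

# Illustrative matrices (not from the paper).
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
C = [[2, 0], [1, 7]]

t_abc = trace_of_product(A, B, C)
t_bca = trace_of_product(B, C, A)
t_cab = trace_of_product(C, A, B)
```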

If $u$ and $v$ are two real vectors with a common index set
$\mathcal{I}$, then their dot product (inner product) is defined as

$$u \cdot v = \sum_{i \in \mathcal{I}} u_i v_i.$$



The trace diagram (or normal factor graph) of a dot product
is illustrated in Figure 6(a).

Fig. 6. Representations of (a) a dot product $u \cdot v$; (b) a cross product $u \times v$.

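If $u$ and $v$ are two real three-dimensional vectors, then their
cross product $u \times v = w$ is defined by

$$w_1 = u_2 v_3 - u_3 v_2; \quad w_2 = u_3 v_1 - u_1 v_3; \quad w_3 = u_1 v_2 - u_2 v_1.$$

Equivalently,

$$w_i = \sum_{j=1}^{3} \sum_{k=1}^{3} \varepsilon_{ijk}\, u_j v_k,$$

where we use the Levi-Civita symbol $\varepsilon_{ijk}$, defined as

$$\varepsilon_{ijk} = \begin{cases} +1, & \text{if } ijk \text{ is an even permutation of } 123; \\ -1, & \text{if } ijk \text{ is an odd permutation of } 123; \\ 0, & \text{otherwise.} \end{cases}$$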



Thus $w$ is given in the form of a normal partition function
with external variable $I$ and internal variables $J$ and $K$. The
trace diagram or NFG of this cross product is illustrated in
Figure 6(b). (Notice that in this case the order of the indices
is important, since $\varepsilon_{ijk} = -\varepsilon_{jik}$.)

Similarly, the determinant of a $3 \times 3$ matrix $M$ may be
written in terms of $\varepsilon_{ijk}$ as

$$\det M = \sum_{i=1}^{3} \sum_{j=1}^{3} \sum_{k=1}^{3} \varepsilon_{ijk}\, M_{1i} M_{2j} M_{3k}.$$

Thus if $M_1$, $M_2$ and $M_3$ are the three rows of $M$, then its
determinant may be represented in trace diagram or normal
factor graph notation as in Figure 7.

Fig. 7. Representation of a determinant $\det\{M_1, M_2, M_3\}$.

Figure 7 shows that the determinant of $M$ may be expressed
in three equivalent ways, as follows:

$$\det M = M_1 \cdot (M_2 \times M_3) = M_2 \cdot (M_3 \times M_1) = M_3 \cdot (M_1 \times M_2).$$
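The Levi-Civita formulas for the cross product and the determinant translate directly into index sums. The sketch below uses 0-based indices, a closed-form expression for the symbol that is valid only on the index set {0, 1, 2}, and an illustrative matrix `M`; all names are assumptions made for this example.

```python
from itertools import product

def eps(i, j, k):
    """Levi-Civita symbol on indices 0, 1, 2 (0-based version of 1, 2, 3)."""
    return (i - j) * (j - k) * (k - i) // 2

def cross(u, v):
    # w_i = sum_{j,k} eps_ijk u_j v_k
    return [sum(eps(i, j, k) * u[j] * v[k]
                for j, k in product(range(3), repeat=2))
            for i in range(3)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def det3(M):
    # det M = sum_{i,j,k} eps_ijk M_1i M_2j M_3k
    return sum(eps(i, j, k) * M[0][i] * M[1][j] * M[2][k]
               for i, j, k in product(range(3), repeat=3))

M = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]  # illustrative matrix
triple_products = [dot(M[0], cross(M[1], M[2])),
                   dot(M[1], cross(M[2], M[0])),
                   dot(M[2], cross(M[0], M[1]))]
```

All three scalar triple products agree with `det3(M)`, mirroring the three readings of the same diagram.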

The trace diagram notation permits other operations that
have not heretofore been considered in the factor graph
literature. For example, two trace diagrams with the same sets of
external variables that are connected by a plus or minus sign
represent the sum or difference of the corresponding partition
functions.^4 For example, Figure 8 illustrates the "contracted
epsilon identity," namely

$$\sum_{i} \varepsilon_{ijk}\, \varepsilon_{imn} = \delta_{jm}\delta_{kn} - \delta_{jn}\delta_{km}.$$

Fig. 8. Contracted epsilon identity.



From this identity, or its corresponding trace diagram, we
can derive such identities as

$$(u \times v) \times w = (u \cdot w)v - (v \cdot w)u,$$

illustrated in Figure 9(a), or

$$(u \times v) \cdot (w \times x) = (u \cdot w)(v \cdot x) - (u \cdot x)(v \cdot w),$$

illustrated in Figure 9(b), which reduce expressions involving
two cross products to simpler forms involving only dot
products.

^4 A product of partition functions is represented simply by a disconnected
factor graph, with each component graph representing a component function.
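The contracted epsilon identity has only 81 index combinations, so it can be verified exhaustively. The sketch below states it in the standard form (one shared summed index, four free indices); the index naming and the helper names are assumptions of this example.

```python
from itertools import product

def eps(i, j, k):
    """Levi-Civita symbol on indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0

def contracted_identity_holds():
    # Check sum_i eps_ijk eps_imn == d_jm d_kn - d_jn d_km for all j, k, m, n.
    for j, k, m, n in product(range(3), repeat=4):
        lhs = sum(eps(i, j, k) * eps(i, m, n) for i in range(3))
        rhs = delta(j, m) * delta(k, n) - delta(j, n) * delta(k, m)
        if lhs != rhs:
            return False
    return True
```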



Fig. 9. Cross product identities: (a) $(u \times v) \times w = (u \cdot w)v - (v \cdot w)u$;
(b) $(u \times v) \cdot (w \times x) = (u \cdot w)(v \cdot x) - (u \cdot x)(v \cdot w)$.



IV. The sum-product algorithm 

The sum-product algorithm is an efficient method for com- 
puting partition functions of cycle-free graphs. It has been 
explained many times, including in [12]. Here we explain it
again in the language of normal factor graphs, with the objec- 
tive of achieving a clearer and more intuitive explanation than 
in m. We freely use ideas from e.g., [I] , [H] , [B] , [E] , [26j . 

As Al-Bashabsheh and Mao [2] have emphasized, a partition
function is completely determined by the set $\{f_k(\mathbf{x}_k, \mathbf{y}_k)\}$ of
factors, independent of their ordering. In evaluating a partition
function, factors may be arbitrarily ordered and grouped. This
observation (called the "generalized distributive law" by Aji
and McEliece [1]) is at the root of the sum-product algorithm.

We start with a normal realization of a partition function
with no external variables whose associated normal graph $\mathcal{G}$
is connected and cycle-free. Thus the partition function of $\mathcal{G}$
is a constant, denoted by $Z(\mathcal{G})$, and $\mathcal{G}$ is an ordinary graph
(no half-edges) that moreover is a tree.

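A connected graph $\mathcal{G}$ is cycle-free if and only if any cut
through any edge $Y_j$ divides $\mathcal{G}$ into two disconnected graphs,
which we label arbitrarily as $\overrightarrow{\mathcal{G}}_j$ and $\overleftarrow{\mathcal{G}}_j$. Such a cut divides the
edge associated with $Y_j$ into two half-edges associated with
two external variables, denoted by $\overrightarrow{Y}_j$ and $\overleftarrow{Y}_j$, with the same
alphabet as $Y_j$, as illustrated in Figure 10.

Fig. 10. Disconnecting a cycle-free NFG $\mathcal{G}$ by a cut through edge $Y_j$.

Let us define the messages $\overrightarrow{\mu}_j(y_j)$ and $\overleftarrow{\mu}_j(y_j)$ as the
partition functions of $\overrightarrow{\mathcal{G}}_j$ and $\overleftarrow{\mathcal{G}}_j$, respectively; i.e.,

$$\overrightarrow{\mu}_j(y_j) = \sum_{\overrightarrow{\mathbf{y}}_j} \prod_{k \in \overrightarrow{\mathcal{K}}_j} f_k(\mathbf{y}_k),$$

where $\overrightarrow{\mathbf{Y}}_j$ is the set of left-side variables (excluding $Y_j$), and
$\overrightarrow{\mathcal{K}}_j$ is the set of indices of left-side factors, and similarly for
$\overleftarrow{\mu}_j(y_j)$. The goal of the sum-product algorithm is to compute
the messages $\overrightarrow{\mu}_j(y_j)$, $\overleftarrow{\mu}_j(y_j)$ for every internal variable $Y_j$.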

To compute a message such as $\overrightarrow{\mu}_j(y_j)$, consider the factor
vertex to which $\overrightarrow{Y}_j$ is attached. For simplicity, let us suppose
that this vertex has degree 3, and that the associated factor is
$f(y_j, y_{j'}, y_{j''})$, as shown in Figure 11.

Fig. 11. Expressing an NFG in terms of subgraphs connected to a vertex.

Since $\mathcal{G}$ is cycle-free, the subgraphs $\overrightarrow{\mathcal{G}}_{j'}$ and $\overrightarrow{\mathcal{G}}_{j''}$ that
extend from the edges $Y_{j'}$ and $Y_{j''}$ must be disjoint. Their
partition functions, $\overrightarrow{\mu}_{j'}(y_{j'})$ and $\overrightarrow{\mu}_{j''}(y_{j''})$, include all factors
in $\overrightarrow{\mu}_j(y_j)$ except $f(y_j, y_{j'}, y_{j''})$, and sum over all internal
variables except $Y_{j'}$ and $Y_{j''}$. Therefore the partition function
$\overrightarrow{\mu}_j(y_j)$ of $\overrightarrow{\mathcal{G}}_j$ may be expressed in terms of the partition
functions of these subgraphs as follows:

$$\overrightarrow{\mu}_j(y_j) = \sum_{y_{j'},\, y_{j''}} f(y_j, y_{j'}, y_{j''})\, \overrightarrow{\mu}_{j'}(y_{j'})\, \overrightarrow{\mu}_{j''}(y_{j''}).$$

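More generally, if the factor vertex to which edge $Y_j$ is
attached is $f_k(\mathbf{y}_k)$, then the message update rule is

$$\overrightarrow{\mu}_j(y_j) = \sum_{\mathbf{y}_k \setminus \{y_j\}} f_k(\mathbf{y}_k) \prod_{j' \in \mathcal{J}_k \setminus \{j\}} \overrightarrow{\mu}_{j'}(y_{j'}),$$

where $\mathcal{J}_k$ is the set of indices of the edges incident on $f_k$.
This is called the sum-product update rule.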

Since $\mathcal{G}$ is connected and cycle-free, it is a tree (assuming
that it is finite). Each message $\overrightarrow{\mu}_j$ has a depth equal to the
maximum length of any path from that message to any leaf
vertex. The messages at depth 1 can be computed immediately,
the messages at depth 2 can be computed as soon as the
messages at depth 1 are known, and so forth. If $\mathcal{G}$ is finite, then
all messages can be computed in at most $\delta(\mathcal{G})$ rounds, where
$\delta(\mathcal{G})$ is the maximum possible depth, called the diameter.

For any internal variable $Y_j$, we define the marginal
partition function $Z_j(y_j)$ as

$$Z_j(y_j) = \overrightarrow{\mu}_j(y_j)\, \overleftarrow{\mu}_j(y_j), \quad y_j \in \mathcal{Y}_j.$$

Thus $Z_j(y_j)$ is simply the componentwise product of
the messages $\overrightarrow{\mu}_j(y_j)$ and $\overleftarrow{\mu}_j(y_j)$. This is sometimes called
the past-future decomposition rule [12].

Graphically, $Z_j(y_j)$ is the partition function of the graph
obtained from $\mathcal{G}$ by converting $Y_j$ from an internal to an
external variable as shown in Figure 12; i.e., by replacing
the edge associated with $Y_j$ by a "tap" consisting of the
concatenation of an edge labeled by $Y_j$, an equality indicator
function, and another edge labeled by $Y_j$, with a further
half-edge labeled by $Y_j$ attached to the equality indicator function.

Fig. 12. Converting $Y_j$ from internal to external by inserting a "tap."

Conversely, $Z(\mathcal{G})$ is the partition function of the graph
obtained by converting $Y_j$ back to an internal variable; i.e.,
by summing $Z_j(y_j)$ over $Y_j$:

$$Z(\mathcal{G}) = \sum_{y_j} Z_j(y_j) = \sum_{y_j} \overrightarrow{\mu}_j(y_j)\, \overleftarrow{\mu}_j(y_j).$$

Thus, for any edge $Y_j$, $Z(\mathcal{G})$ is simply the dot product of the
messages $\overrightarrow{\mu}_j$ and $\overleftarrow{\mu}_j$.
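The sum-product update rule and the dot-product formula for the partition function can be sketched recursively on a small tree. The graph, alphabets, and factor values below are hypothetical, and the recursion recomputes messages rather than scheduling them by depth, which keeps the sketch short at the cost of efficiency.

```python
from itertools import product

# Hypothetical cycle-free NFG: a star with centre f0 and leaves f1, f2, f3,
# plus a further leaf f4 hanging off f3. Edges are internal variables.
alphabet = {"Y1": [0, 1], "Y2": [0, 1, 2], "Y3": [0, 1], "Y4": [0, 1]}
factors = {
    "f0": (("Y1", "Y2", "Y3"), lambda a, b, c: 1 + a + b * c),
    "f1": (("Y1",), lambda a: 2 - a),
    "f2": (("Y2",), lambda b: 1 + b),
    "f3": (("Y3", "Y4"), lambda c, d: 1 + c * d),
    "f4": (("Y4",), lambda d: 3 if d else 1),
}

def message(edge, source):
    """Partition function of the subtree on the `source` side of `edge`,
    as a dict over the values of `edge` (sum-product update rule)."""
    variables, f = factors[source]
    others = [v for v in variables if v != edge]
    # Incoming messages on the other edges, from their far-side factors.
    incoming = {}
    for v in others:
        far = next(k for k, (vs, _) in factors.items()
                   if v in vs and k != source)
        incoming[v] = message(v, far)
    out = {}
    for y in alphabet[edge]:
        total = 0
        for ys in product(*(alphabet[v] for v in others)):
            values = dict(zip(others, ys))
            values[edge] = y
            term = f(*(values[v] for v in variables))
            for v in others:
                term *= incoming[v][values[v]]
            total += term
        out[y] = total
    return out

def Z_via_edge(edge):
    """Dot product of the two oppositely directed messages on `edge`."""
    a, b = [k for k, (vs, _) in factors.items() if edge in vs]
    ma, mb = message(edge, a), message(edge, b)
    return sum(ma[y] * mb[y] for y in alphabet[edge])

def Z_brute_force():
    names = list(alphabet)
    total = 0
    for ys in product(*(alphabet[n] for n in names)):
        values = dict(zip(names, ys))
        term = 1
        for variables, f in factors.values():
            term *= f(*(values[v] for v in variables))
        total += term
    return total
```

Cutting at any of the four edges gives the same constant, which matches the brute-force sum over all configurations.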



V. Holographic Transformations 

In this section, we recapitulate and generalize the concept 
of "holographic transformations" of normal factor graphs, 
which was introduced by Al-Bashabsheh and Mao [2], and
their "generalized Holant theorem," which relates the partition 
function of a normal factor graph to that of its holographic 
transform. This theorem generalizes the Holant theorem of 
Valiant [21] (see also H, 0, Q, ©, [22]), which has
been used to show that some seemingly intractable counting 
problems on graphs are in fact tractable. 

Using this concept, Al-Bashabsheh and Mao [2] were able 
to prove a very general and powerful Fourier transform duality 
theorem for normal factor graphs, of which the original normal 
graph duality theorem of [12] is an immediate corollary. We
give a variation of this proof which is perhaps even simpler
(compare also the proof in [13]).

In the last section of this paper, we will sketch further 
applications of this general approach. 

A. General approach 

The general approach can be explained very simply, as
follows. Let $\mathcal{A}$ and $\mathcal{B}$ be two finite alphabets, which will often
be of the same size; i.e., $|\mathcal{A}| = |\mathcal{B}|$. Let $U(a, b)$, $S(b, b')$,
and $V(b', a')$ be complex-valued factors involving variables
$A$, $B$, $B'$, and $A'$ defined on $\mathcal{A}$, $\mathcal{B}$, $\mathcal{B}$, and $\mathcal{A}$, respectively;
alternatively, we may regard $U$, $S$, and $V$ as matrices. Finally,
suppose that the concatenation $USV$, shown in Figure 13, is
the identity factor $\delta_{aa'}$, which can be represented simply as
an ordinary edge as in Figure 3.^5

Fig. 13. A concatenation of factors that is equivalent to the identity.

We then have the following obvious lemma: 

Lemma (generalized holographic transformations). In any
NFG, any ordinary edge may be replaced by a concatenation
of factors $USV$ equivalent to the identity, as in Figure 13,
without changing the partition function. □
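The lemma can be illustrated on the smallest possible NFG: two factors joined by one edge. In the sketch below the matrices `U`, `S`, `V` and the factor values are arbitrary choices made for this example, subject only to the requirement that their concatenation is the identity; exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction as Fr
from itertools import product

f = [3, 5]          # factor f(a) on one side of the edge (illustrative)
g = [2, 7]          # factor g(a') on the other side (illustrative)

U = [[Fr(1), Fr(1)], [Fr(0), Fr(1)]]
S = [[Fr(2), Fr(0)], [Fr(0), Fr(1)]]
V = [[Fr(1, 2), Fr(-1, 2)], [Fr(0), Fr(1)]]   # chosen so that U S V = I

def usv(a, a2):
    """Entry (a, a2) of the concatenation U S V, summing over b and b'."""
    return sum(U[a][b] * S[b][b2] * V[b2][a2]
               for b, b2 in product(range(2), repeat=2))

# Original NFG: f and g joined by a single edge.
Z_original = sum(f[a] * g[a] for a in range(2))
# Transformed NFG: the edge is replaced by the chain U - S - V.
Z_transformed = sum(f[a] * usv(a, a2) * g[a2]
                    for a, a2 in product(range(2), repeat=2))
```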

The "holographic transformations" of [2] involve similar
replacements, except without the middle factor $S$ (alternatively,
with $S(b, b') = \delta_{bb'}$). Al-Bashabsheh and Mao [2] call $\mathcal{B}$ the
coupling alphabet, and say that $U$ and $V$ are dual with respect
to $\mathcal{B}$. When $|\mathcal{A}| = |\mathcal{B}|$, they say that $U$ and $V$ are transformers;
in this case, as matrices, $U$ and $V$ are inverses.

If a normal factor graph has external variables $X_i$, then
they may be transformed as well, by the insertion of a factor
or matrix $W_i(x_i, w_i)$ defined on $\mathcal{X}_i \times \mathcal{W}_i$, where $\mathcal{W}_i$ is the
alphabet of a transformed external variable $W_i$. Thus the
partition function is transformed into a function of the new
external variables $W_i$. This is the essence of the "generalized
Holant theorem" of [2]. (The original Holant theorem of
Valiant [21] applies when there are no external variables.)

^5 Here and subsequently we may label an internal edge simply by its
alphabet, without introducing dummy internal variables.



B. General normal factor graph duality theorem 

This general approach yields a very simple proof of the
"general normal factor graph duality theorem" of [2], [13].

Suppose that we have a normal factor graph in which each
variable alphabet $\mathcal{A}$ is a finite-dimensional vector space over
a finite field $\mathbb{F}$ of characteristic $p$ (i.e., $p$ is the least positive
integer such that $pa = 0$ for all $a \in \mathbb{F}$). The dual space $\hat{\mathcal{A}}$ is
then a vector space over $\mathbb{F}$ of the same dimension as $\mathcal{A}$, and
there is a well-defined $\mathbb{Z}_p$-valued inner product $\langle \hat{a}, a \rangle$ with
the usual properties; e.g., $\langle \hat{a}, 0 \rangle = \langle 0, a \rangle = 0$, $\langle \hat{a}, a + a' \rangle =
\langle \hat{a}, a \rangle + \langle \hat{a}, a' \rangle$, and so forth (see, e.g., [11]).

Given a complex-valued function $f : \mathcal{A} \to \mathbb{C}$ defined on $\mathcal{A}$,
its Fourier transform is then defined as the complex-valued
function $\hat{f} : \hat{\mathcal{A}} \to \mathbb{C}$ on $\hat{\mathcal{A}}$ that maps $\hat{a}$ to

$$\hat{f}(\hat{a}) = \sum_{a \in \mathcal{A}} f(a)\, \omega^{\langle \hat{a}, a \rangle},$$

where $\omega = e^{2\pi i/p}$ is a primitive complex $p$th root of unity.

In an NFG, a Fourier transform may be represented as in
Figure 14, where the Fourier transform factor is

$$\mathcal{F}_{\mathcal{A}} = \{\omega^{\langle \hat{a}, a \rangle} : a \in \mathcal{A}, \hat{a} \in \hat{\mathcal{A}}\}.$$

The transform $\hat{f}(\hat{a})$ is obtained by summing over $\mathcal{A}$, which
in this case amounts to a matrix-vector multiplication.

Fig. 14. Normal factor graph of a Fourier transform.

Note that as a factor in an NFG, we do not have to
distinguish between $\mathcal{F}_{\mathcal{A}}$ and its transpose; $\mathcal{F}_{\mathcal{A}}$ is simply a
function of the two variables corresponding to the two incident
edges, and as a matrix can act on either variable. Thus $\mathcal{F}_{\mathcal{A}}$
can act also as a Fourier transform $\mathcal{F}_{\hat{\mathcal{A}}}$ on a function of $\hat{A}$.

More generally, given a complex-valued multivariate
function $f(\mathbf{a})$ defined on a set of variables $\mathbf{A} = \{A_i\}$ whose
alphabets $\mathcal{A}_i$ are vector spaces over $\mathbb{F}$, its Fourier transform
is defined as the complex-valued function

$$\hat{f}(\hat{\mathbf{a}}) = \sum_{\mathbf{a}} f(\mathbf{a}) \prod_{i} \omega^{\langle \hat{a}_i, a_i \rangle}.$$

In other words, in a normal factor graph, each variable $A_i$ may
be transformed separately, as illustrated in Figure 15. In [2],
this property is called separability.

Fig. 15. Fourier transform of multivariate function $f(a_1, a_2, a_3)$.

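Now let us define $U = V = \mathcal{F}_{\mathcal{A}}$ and $S = \Phi_\sim / |\mathcal{A}|$, where
the sign inverter indicator function $\Phi_\sim$ over $\hat{\mathcal{A}}$ is defined as

$$\Phi_\sim(\hat{a}, \hat{a}') = \begin{cases} 1, & \text{if } \hat{a} = -\hat{a}'; \\ 0, & \text{otherwise.} \end{cases}$$

Then the concatenation $USV$ is the identity, since

$$\frac{1}{|\mathcal{A}|} \sum_{\hat{a} \in \hat{\mathcal{A}}} \omega^{\langle \hat{a}, a \rangle}\, \omega^{\langle -\hat{a}, a' \rangle} = \delta_{aa'}$$

by a basic orthogonality relation for Fourier transforms over
finite groups (see, e.g., [11]). This result is illustrated in
Figure 16, where we omit the scale factor of $|\mathcal{A}|$.

Fig. 16. A concatenation of factors that is equivalent to an edge, up to scale.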
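The orthogonality relation behind this concatenation can be checked numerically in the simplest setting. The sketch below assumes a one-dimensional alphabet, the integers modulo a prime `p`, with the inner product taken as multiplication modulo `p`; these choices and the function names are assumptions of the example.

```python
import cmath

p = 5                                   # illustrative prime
omega = cmath.exp(2j * cmath.pi / p)    # primitive p-th root of unity

def F(a_hat, a):
    """Fourier transform factor omega^<a_hat, a>, with <a_hat, a> = a_hat * a mod p."""
    return omega ** ((a_hat * a) % p)

def sign_inverter(a_hat, b_hat):
    """Indicator of a_hat = -b_hat (mod p)."""
    return 1 if (a_hat + b_hat) % p == 0 else 0

def usv(a, a2):
    """Concatenation F, (sign inverter)/p, F evaluated at (a, a2)."""
    return sum(F(a, ah) * sign_inverter(ah, bh) * F(bh, a2)
               for ah in range(p) for bh in range(p)) / p

# Largest deviation of the concatenation from the Kronecker delta.
max_error = max(abs(usv(a, a2) - (1 if a == a2 else 0))
                for a in range(p) for a2 in range(p))
```

Up to floating-point rounding, the chain acts as an ordinary edge.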

Now we can prove our desired result: 

Normal factor graph duality theorem [2], [13]. Given
an NFG with partition function $Z(\mathbf{x})$, comprising external
variables $X_i$ associated with half-edges, internal variables $Y_j$
associated with ordinary edges (all alphabets being vector
spaces over a finite field $\mathbb{F}$), and complex-valued factors $f_k$
associated with vertices, the dual normal factor graph is
defined by replacing each alphabet $\mathcal{X}_i$ or $\mathcal{Y}_j$ by its dual
alphabet $\hat{\mathcal{X}}_i$ or $\hat{\mathcal{Y}}_j$, each factor $f_k$ by its Fourier transform
$\hat{f}_k$, and finally by placing a sign inverter indicator function
$\Phi_\sim$ in the middle of every ordinary edge. Then the partition
function of the dual NFG is the Fourier transform $\hat{Z}(\hat{\mathbf{x}})$ of
$Z(\mathbf{x})$, up to scale.^6 □

Proof: Let us first convert the given NFG with partition
function $Z(\mathbf{x})$ to an NFG with partition function $\hat{Z}(\hat{\mathbf{x}})$, up to scale,
by appending a Fourier transform $\mathcal{F}_{\mathcal{X}_i}$ from $\mathcal{X}_i$ to $\hat{\mathcal{X}}_i$ to every
half-edge associated with every external variable $X_i$, as in
Figure 15. Then let us replace every ordinary edge associated
with every internal variable $Y_j$ by a concatenation $\mathcal{F}_{\mathcal{Y}_j} \Phi_\sim \mathcal{F}_{\mathcal{Y}_j}$
like that shown in Figure 16; this preserves the partition
function $\hat{Z}(\hat{\mathbf{x}})$, up to scale. Now each vertex associated with
each factor $f_k$ is surrounded by Fourier transforms of all of the
variables involved in $f_k$, so it and its surrounding transforms
may be replaced by a single vertex representing the Fourier
transform factor $\hat{f}_k$ without changing the partition function,
up to scale. □

Notice that this remarkably general theorem applies to any 
normal factor graph, whether or not it has cycles. 
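A small numerical check may make the theorem concrete. The sketch assumes a hypothetical NFG with one external variable X, one internal variable Y, and two factors f(x, y) and g(y), all over the integers modulo a prime `p` with the inner product taken as multiplication modulo `p`; the factor values are made up.

```python
import cmath

p = 3
R = range(p)
omega = cmath.exp(2j * cmath.pi / p)

f = [[1, 2, 0], [4, 1, 3], [2, 2, 5]]   # f[x][y], hypothetical values
g = [1, 0, 2]                            # g[y], hypothetical values

def Z(x):
    """Partition function of the primal NFG."""
    return sum(f[x][y] * g[y] for y in R)

def Z_hat(xh):
    """Fourier transform of Z."""
    return sum(Z(x) * omega ** (xh * x % p) for x in R)

def f_hat(xh, yh):
    return sum(f[x][y] * omega ** ((xh * x + yh * y) % p)
               for x in R for y in R)

def g_hat(yh):
    return sum(g[y] * omega ** (yh * y % p) for y in R)

def Z_dual(xh):
    # Sign inverter on the single edge: f_hat sees yh, g_hat sees -yh.
    return sum(f_hat(xh, yh) * g_hat(-yh % p) for yh in R)

# The dual partition function should equal p times the transform of Z.
max_error = max(abs(Z_dual(xh) - p * Z_hat(xh)) for xh in R)
```

In this one-edge example the scale factor is the size of the single internal alphabet, namely `p`.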

Using the fact that the indicator functions of a linear code $\mathcal{C}$
over $\mathbb{F}$ and of its orthogonal code $\mathcal{C}^\perp$ are a Fourier transform
pair, up to scale, one obtains as an immediate corollary a
duality theorem for normal factor graph representations of
linear codes [13], which is equivalent to the original
normal graph duality theorem of [12].

VI. Further Developments 

We now sketch briefly how the "tree-based
reparameterization" approach of Wainwright et al. [25], the "loop calculus"
results of Chertkov and Chernyak [7], [8], and the Lagrange
duality results of Vontobel and Loeliger [23], [24] fit within
this generalized framework. The full developments will appear
in a subsequent version of this paper.

^6 As shown in [2], the scale factor is $|\mathcal{Y}|$.



A. Tree-based reparameterization 

Wainwright, Jaakkola, and Willsky [25] have shown how
the sum-product algorithm applied to general graphs with 
cycles can be understood as a tree-based reparameterization 
algorithm, where each round of the message-passing algorithm 
reparameterizes marginal distributions over simple subtrees 
consisting of a pair of vertices connected by an edge. More 
generally, they consider iterative algorithms that reparameter- 
ize distributions over arbitrary cycle-free subtrees of the graph, 
particularly spanning trees. 

Let $X$ be a set of $m$ variables $X_i$ taking values $x_i$ in finite
alphabets $\mathcal{X}_i$, and let $E$ be a set of pairs $(X_i, X_j)$ indicating
which pairs of variables are connected. Suppose that the
corresponding graph with vertices $X_i$ and edges $(X_i, X_j) \in E$
is a tree (i.e., cycle-free). Finally, suppose that a probability
distribution $p(\mathbf{x})$ over these variables can be expressed as

$$p(\mathbf{x}) \propto \prod_{1 \le i \le m} \psi_i(x_i) \prod_{(X_i, X_j) \in E} \psi_{ij}(x_i, x_j),$$

where the functions $\psi_i(x_i)$ and $\psi_{ij}(x_i, x_j)$ depend only on the
singleton variables $X_i$ and pairs $(X_i, X_j)$, respectively. (By
the Hammersley-Clifford theorem, this can always be done
when $p(\mathbf{x})$ is a positive Markov random field over the graph.)

We can view such a distribution p(x) as a partition function 
in which all variables are external (a "global function"). 
Normalizing this partition function, we obtain an equivalent 
partition function with the same external variables, but with 
an equality indicator function corresponding to each external 
variable replacing it in the corresponding normal factor graph. 
A typical fragment of such an NFG is shown in Figure 17.

Fig. 17. Fragment of NFG representing a probability distribution on a tree.

Now we can execute the sum-product algorithm on such
a cycle-free NFG, obtaining on each edge two messages,
say $\overrightarrow{\mu}_i(x_i)$ and $\overleftarrow{\mu}_i(x_i)$ on an edge with alphabet $\mathcal{X}_i$. The
corresponding marginal probability distribution $p_i(x_i)$ is
proportional to the componentwise product of these messages:

$$p_i(x_i) \propto \overrightarrow{\mu}_i(x_i)\, \overleftarrow{\mu}_i(x_i), \quad x_i \in \mathcal{X}_i.$$

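Such a marginal distribution can be exhibited explicitly as
a message in a "reparameterized" NFG by replacing a factor
such as $\psi_{ij}(x_i, x_j)$ by the concatenation of three factors:

$$U(x_i, x_i') = \overleftarrow{\mu}_i(x_i)\, \delta(x_i, x_i');$$

$$S(x_i', x_j') = \frac{\psi_{ij}(x_i', x_j')}{\overleftarrow{\mu}_i(x_i')\, \overrightarrow{\mu}_j(x_j')};$$

$$V(x_j, x_j') = \overrightarrow{\mu}_j(x_j)\, \delta(x_j, x_j'),$$

which evidently preserves the partition function.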



Such a reparameterization can be performed also in a graph 
with cycles, or over a subtree of a given graph. Nice results 
are obtained when the messages are those that occur at a fixed 
point of the sum-product algorithm, but the messages do not 
have to be chosen in this way. 
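The reparameterization step can be sketched on a single pairwise factor. The code below assumes that the middle factor is the pairwise factor divided by the two messages, which is the form forced by requiring the three-factor concatenation to reproduce the original factor; the numerical values are arbitrary nonzero choices and need not be sum-product fixed points.

```python
from fractions import Fraction as Fr

psi = [[Fr(2), Fr(1)], [Fr(1), Fr(3)]]   # pairwise factor psi_ij(x_i, x_j)
mu_i = [Fr(4), Fr(5)]                     # message on the X_i side (nonzero)
mu_j = [Fr(2), Fr(7)]                     # message on the X_j side (nonzero)

def delta(a, b):
    return 1 if a == b else 0

def U(xi, xi2):
    return mu_i[xi] * delta(xi, xi2)

def S(xi2, xj2):
    # Assumed form: the original factor with both messages divided out.
    return psi[xi2][xj2] / (mu_i[xi2] * mu_j[xj2])

def V(xj, xj2):
    return mu_j[xj] * delta(xj, xj2)

def concatenation(xi, xj):
    """Sum over the two new internal variables joining U, S, and V."""
    return sum(U(xi, a) * S(a, b) * V(xj, b)
               for a in range(2) for b in range(2))
```

Because the concatenation equals the original factor entry by entry, the replacement can be made on any edge, in a tree or in a graph with cycles.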

In future work, we plan to use this approach to restate and 
generalize many of the results of [25] and related papers.



B. Loop calculus 

Chertkov and Chernyak Q, (U, lH) have developed a "loop 
calculus" for statistical systems defined on finite graphs that 
allows the partition function of a system to be expressed as a 
finite sum over "generalized loops," in which the lowest-order 
term corresponds to the Bethe-Peierls (sum-product algorithm) 
approximation. 

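We briefly sketch our approach to their results. Suppose
that all alphabets are binary. Then replace every edge $Y_j$ in the
system by the concatenation $U_j S_j V_j$, where in matrix notation

$$U_j = \begin{bmatrix} \overrightarrow{\mu}_j(0) & -\overleftarrow{\mu}_j(1) \\ \overrightarrow{\mu}_j(1) & +\overleftarrow{\mu}_j(0) \end{bmatrix}; \quad S_j = \frac{1}{\Delta_j} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}; \quad V_j = \begin{bmatrix} \overleftarrow{\mu}_j(0) & \overleftarrow{\mu}_j(1) \\ -\overrightarrow{\mu}_j(1) & +\overrightarrow{\mu}_j(0) \end{bmatrix},$$

where $\overleftarrow{\mu}_j(y_j)$ and $\overrightarrow{\mu}_j(y_j)$ are functions that may (but need
not) be chosen as fixed-point messages of the sum-product
algorithm, and $\Delta_j = \overrightarrow{\mu}_j(0)\overleftarrow{\mu}_j(0) + \overrightarrow{\mu}_j(1)\overleftarrow{\mu}_j(1)$ is the
determinant of $U_j$ and $V_j$. Evidently the concatenation $U_j S_j V_j$
is the identity, so this replacement preserves the partition
function.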

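Now express every $S_j$ as the sum of two matrices:

$$\frac{1}{\Delta_j} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \frac{1}{\Delta_j} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + \frac{1}{\Delta_j} \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}.$$

If there are $n$ edges $Y_j$, then the partition function of the
original NFG can correspondingly be expressed as the sum
of the partition functions of the $2^n$ component NFGs.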

If the functions $\overleftarrow{\mu}_j(y_j)$ and $\overrightarrow{\mu}_j(y_j)$ are fixed-point
messages of the sum-product algorithm, then it turns out that
the partition function of the "zero-order" component graph
is the Bethe-Peierls partition function (at that fixed point of
the sum-product algorithm); that the partition function of any
component graph with a "loose end" (a vertex of effective
degree 1) is zero; and that the partition functions of the
remaining component graphs (corresponding to "generalized
loops," in which all vertices have effective degree 2 or more)
are "small" multiples of the Bethe-Peierls partition function.
Again, the full development will be given in a subsequent
version of this paper.
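The expansion into component NFGs can be illustrated on a tiny graph with cycles. The sketch below assumes a hypothetical NFG with two binary edges joining two factors, and one choice of 2 x 2 matrices with determinant Delta_j whose concatenation is the identity; the matrix entries, the "messages," and the factor values are all assumptions of this example, and the messages are deliberately not fixed points.

```python
from fractions import Fraction as Fr
from itertools import product

f = [[Fr(1), Fr(2)], [Fr(3), Fr(1)]]      # f[y1][y2], illustrative
g = [[Fr(2), Fr(1)], [Fr(1), Fr(4)]]      # g[y1][y2], illustrative
# Arbitrary positive pairs of functions (right, left) per edge.
mu = {1: ([Fr(1), Fr(2)], [Fr(3), Fr(1)]),
      2: ([Fr(2), Fr(5)], [Fr(1), Fr(1)])}

def W(j, s, a, a2):
    """Entry (a, a2) of U_j S_j^(s) V_j, where S_j^(s) keeps only the
    s-th diagonal entry of the identity, scaled by 1/Delta_j."""
    r, l = mu[j]
    U = [[r[0], -l[1]], [r[1], l[0]]]
    V = [[l[0], l[1]], [-r[1], r[0]]]
    Delta = r[0] * l[0] + r[1] * l[1]     # determinant of U and of V
    return U[a][s] * V[s][a2] / Delta

def component(s1, s2):
    """Partition function of the component NFG selected by (s1, s2)."""
    return sum(f[a1][a2] * W(1, s1, a1, b1) * W(2, s2, a2, b2) * g[b1][b2]
               for a1, a2, b1, b2 in product(range(2), repeat=4))

Z = sum(f[a][b] * g[a][b] for a, b in product(range(2), repeat=2))
Z_components = {s: component(*s) for s in product(range(2), repeat=2)}
```

With n = 2 edges there are four components, and their partition functions sum to that of the original graph regardless of how the functions are chosen.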



C. Lagrange duality 

Structurally similar operations can be used to obtain the
Lagrange duality results for normal graphs of Vontobel and
Loeliger [23], [24], which are based on the Legendre transform
of convex optimization theory.

One interesting aspect of this development is that instead
of sums of products, we consider minima over sums (i.e., the
sum-product semiring over the reals $\mathbb{R}$ is replaced by the
min-sum semiring over the extended real line $\overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty\}$).
Thus a partition function has the following form:

$$Z(\mathbf{x}) = \min_{\mathbf{y} \in \mathcal{Y}} \sum_{k \in \mathcal{K}} f_k(\mathbf{x}_k, \mathbf{y}_k), \quad \mathbf{x} \in \mathcal{X},$$

where the "factors" $f_k(\mathbf{x}_k, \mathbf{y}_k)$ are $\overline{\mathbb{R}}$-valued.

The dual functions under the Legendre transform are
functions in the max-sum semiring. Dualization involves the
insertion of sign inverters into edges, as with Fourier dualization.
Again, details will be provided in future versions of this paper.
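The min-sum form of a partition function can be sketched by swapping the semiring operations in a brute-force evaluator: the sum over internal configurations becomes a minimum, and the product of factors becomes a sum. The function name, the single external and internal variable, and the two cost functions are illustrative assumptions.

```python
import math

def partition_min_sum(ext_alphabet, int_alphabet, factors):
    """Z(x) = min over y of the sum of all factors f_k(x, y)."""
    return {x: min(sum(f(x, y) for f in factors) for y in int_alphabet)
            for x in ext_alphabet}

factors = [
    lambda x, y: (x - y) ** 2,
    # +infinity encodes a hard constraint in the extended reals.
    lambda x, y: math.inf if y < 0 else abs(y),
]
Z = partition_min_sum([0, 1, 2], [-1, 0, 1, 2], factors)
```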